Information filtering based on wiki index database
Authors
Abstract
In this paper we present a profile-based approach to information filtering by an analysis of the content of text documents. The Wikipedia index database is created and used to automatically generate the user profile from the user’s document collection. The problem-oriented Wikipedia subcorpora are created (using knowledge extracted from the user profile) for each topic of user interests. The index databases of these subcorpora are applied to filtering information flow (e.g., mails, news). Thus, the analyzed texts are classified into several topics explicitly presented in the user profile. The paper concentrates on the indexing part of the approach. The architecture of an application implementing the Wikipedia indexing is described. The indexing method is evaluated using the Russian and Simple English Wikipedia.
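The filtering step described above (one index per topic subcorpus, incoming texts assigned to a topic) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names `tokenize`, `build_index`, `topic_score`, and `classify`, the toy pages, and the IDF-style weighting are all assumptions made for the example, and the lemmatization that a real wiki indexer would need is replaced by simple lowercase tokenization.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase word splitting stands in for real lemmatization.
    return re.findall(r"[a-z]+", text.lower())


def build_index(pages):
    """Build an inverted index: term -> {page title: term frequency}."""
    index = defaultdict(dict)
    for title, text in pages.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][title] = count
    return index


def topic_score(index, n_pages, text):
    """Sum IDF-style weights of the text's terms found in a topic index."""
    score = 0.0
    for term, count in Counter(tokenize(text)).items():
        postings = index.get(term)
        if postings:
            # Terms that occur in fewer pages of the subcorpus weigh more.
            score += count * math.log(1 + n_pages / len(postings))
    return score


def classify(topic_indexes, text):
    """Return the best-scoring topic and the score of every topic."""
    scores = {
        topic: topic_score(index, n_pages, text)
        for topic, (index, n_pages) in topic_indexes.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical toy subcorpora, one per topic in the user profile.
subcorpora = {
    "astronomy": {
        "Telescope": "A telescope is an instrument used to observe distant "
                     "objects such as a galaxy or a star",
        "Galaxy": "A galaxy is a system of stars gas and dust",
    },
    "cooking": {
        "Soup": "Soup is a liquid food made by boiling vegetables or meat "
                "in water",
        "Bread": "Bread is a food baked from dough of flour and water",
    },
}
topic_indexes = {
    topic: (build_index(pages), len(pages))
    for topic, pages in subcorpora.items()
}

best_topic, all_scores = classify(
    topic_indexes, "New telescope images show a distant galaxy"
)
print(best_topic, all_scores)
```

In this sketch each topic keeps its own index, so adding a topic to the profile only requires indexing one more subcorpus; the scoring function is a placeholder that could be swapped for any other relevance measure.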
Similar resources
Index wiki database: design and experiments
With the fantastic growth of Internet usage, information search in documents of a special type called a “wiki page” that is written using a simple markup language, has become an important problem. This paper describes the software architectural model for indexing wiki texts in three languages (Russian, English, and German) and the interaction between the software components (GATE, Lemmatizer, a...
Towards an Inquiry-Based Language Learning: Can a Wiki Help?
Wiki use may help EFL instructors to create an effective learning environment for inquiry-based language teaching and learning. The purpose of this study was to investigate the effects of wikis on the EFL learners’ IBL process. Forty-nine EFL students participated in the study while they conducted research projects in English. The Non-wiki group (n = 25) received traditional inquiry instr...
Automatic Population and Updating of a Semantic Wiki-based Configuration Management Database
This paper describes our work on designing and implementing a component for automatically integrating and updating information about configuration items into a Semantic Wiki-based configuration management database. The presented solution uses technology for information gathering which is built-in or available for most current mainstream operating systems. By using Semantic Wiki technology, e.g....
A New Similarity Measure Based on Item Proximity and Closeness for Collaborative Filtering Recommendation
Recommender systems utilize information retrieval and machine learning techniques for filtering information and can predict whether a user would like an unseen item. User similarity measurement plays an important role in collaborative filtering based recommender systems. In order to improve accuracy of traditional user based collaborative filtering techniques under new user cold-start problem a...
Evaluating the status of agricultural articles of Iranian researchers at the Scopus citation database based on the Hirsch index
Background and aim: Today, the use of scientometric methods to evaluate the scientific outputs of researchers in various fields has been highly regarded and the Hirsch index (h-index) is one of the most important scientometric indices due to the simultaneous measurement of quantity and quality of scientific outputs. Therefore, the aim of this study was to evaluate the status of Iranian agricult...
Journal:
- CoRR
Volume: abs/0804.2354
Pages: -
Publication date: 2008